Journal article
On the efficiency of k-means clustering: Evaluation, optimization, and algorithm selection
S Wang, Y Sun, Z Bao
Proceedings of the VLDB Endowment | ASSOC COMPUTING MACHINERY | Published : 2020
Abstract
This paper presents a thorough evaluation of the existing methods that accelerate Lloyd’s algorithm for fast k-means clustering. To do so, we analyze the pruning mechanisms of existing methods, and summarize their common pipeline into a unified evaluation framework UniK. UniK embraces a class of well-known methods and enables a fine-grained performance breakdown. Within UniK, we thoroughly evaluate the pros and cons of existing methods using multiple performance metrics on a number of datasets. Furthermore, we derive an optimized algorithm over UniK, which effectively hybridizes multiple existing methods for more aggressive pruning. To take this further, we investigate whether the most efficient m..
Grants
Awarded by Google
Funding Acknowledgements
Zhifeng Bao is supported by ARC DP200102611, DP180102050 and a Google Faculty Award.